Distributed Information Retrieval Based on Hierarchical Semantic Overlay Network

Authors

  • Fei Liu
  • Fanyuan Ma
  • Minglu Li
  • Linpeng Huang
Abstract

One fundamental problem in information retrieval is supporting queries efficiently, with higher accuracy and fewer logical hops. This paper presents HSPIR (Hierarchical Semantic P2P-based Information Retrieval), which distributes document indices through the P2P network hierarchically, based on document semantics generated by Latent Semantic Indexing (LSI) [1]. HSPIR uses CAN [2] and Range Addressable Network [3] to organize nodes into a hierarchical overlay network. Compared with other P2P search techniques [4, 5] that are based on simple keyword matching, HSPIR achieves better accuracy because it considers a more advanced notion of relevance among documents. We use Agglomerative Information Bottleneck (AIB) [6] to cluster documents and train Directed Acyclic Graph Support Vector Machines (DAGSVM) on the clustered documents. Owing to the hierarchical overlay network, the average number of logical hops per query is smaller than in flat architectures.
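
The hop-count claim can be illustrated with a rough back-of-the-envelope sketch. It relies on the known CAN result that the average routing path length in a d-dimensional CAN with n equal zones is (d/4) * n^(1/d). The two-level model below is a hypothetical simplification, not the analysis from the paper: it assumes CAN-style routing at both levels (the abstract names Range Addressable Network as one of the building blocks), k equally sized clusters, and illustrative values for n, k, and d. The function names are invented for this example.

```python
def can_avg_hops(n: int, d: int) -> float:
    """Average routing path length in a d-dimensional CAN with n equal zones."""
    return (d / 4) * n ** (1 / d)


def two_level_avg_hops(n: int, k: int, d: int) -> float:
    """Hypothetical two-level model (an assumption, not the paper's analysis):
    route among k clusters first, then inside one cluster of n // k nodes."""
    return can_avg_hops(k, d) + can_avg_hops(n // k, d)


# Illustrative numbers only: 10,000 nodes, 100 clusters, 2 dimensions.
flat = can_avg_hops(10_000, 2)
hier = two_level_avg_hops(10_000, 100, 2)
print(f"flat: {flat:.1f} hops, two-level: {hier:.1f} hops")
```

Under these assumptions the flat estimate is about 50 hops and the two-level estimate about 10, which shows why splitting routing across levels can shorten paths. The actual savings in HSPIR depend on its real overlay construction.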

Similar resources

Semiautomatic Image Retrieval Using the High Level Semantic Labels

Content-based image retrieval and text-based image retrieval are two fundamental approaches in the field of image retrieval. The challenges associated with each approach have led researchers to combine them and to use semi-automatic retrieval, with user interaction in the retrieval cycle. Hence, in this paper, an image retrieval system is introduced that provides two kinds of qu...

A Hierarchical Semantic Overlay for P2P Search

In this paper, we propose a hierarchical semantic overlay network for searching heterogeneous data over wide-area networks. In this system, data are represented as RDF triples based on ontologies. Peers that have the same semantics are organized into a semantic cluster, and the semantic clusters are self-organized into a one-dimensional ring space to form the top-level semantic overlay network. ...

Hierarchical Query Routing in P2P Information Filtering Systems

In recent years, various P2P network applications, such as Napster, Gnutella, WinMX, Winny, and many others, have been developed. However, it is difficult to find suitable information resources and to guarantee practical response times using distributed systems with simple search functions. Furthermore, most applications are based on overlay networks with unstable network topology and query messages...

SOON: A Scalable Self-organized Overlay Network for Distributed Information Retrieval

Locating desirable resources and information in a large-scale distributed system such as a P2P system or a grid is a very important issue. However, the distributed, heterogeneous, and unstructured nature of the system makes this issue very challenging. In this paper, we propose Self-Organized Overlay Network (SOON), an unstructured P2P overlay architecture, to facilitate sharing and searching se...

A DHT-based semantic overlay network for service discovery

The number of available Internet services increases every day. This trend demands distributed models and architectures to support scalability as well as semantics to enable efficient publication and retrieval of services. Two common approaches toward this goal are semantic overlay networks (SONs) and distributed hash tables (DHTs) with semantic extensions. SONs enable semantic-driven query answ...

Publication date: 2004